The International Arab Journal of Information Technology (IAJIT)


The efficiency and correctness of continuous Arabic Speech Recognition Systems (ARS) hinge on the accuracy of the language phoneme set. The main goal of this research is to recognize and transcribe Arabic phonemes using a data-driven approach. We used the Hidden Markov Model Toolkit (HTK) to develop a phoneme recognizer, carrying out several experiments with different parameters, such as varying the number of Hidden Markov Model (HMM) states and Gaussian mixtures used to model the Arabic phonemes, in order to find the best configuration. We used a corpus of about 4000 files, representing 5 recorded hours of Modern Standard Arabic (MSA) TV news. A statistical analysis of phoneme length, frequency and mode was carried out to determine the number of states best suited to represent each phoneme. A phoneme recognition accuracy of 56.79% was reached without using a language model. The recognition accuracy increased to 96.3% upon using a bigram language model.
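To make the role of the bigram language model concrete, the following is a minimal sketch of how bigram probabilities over phoneme sequences could be estimated from transcriptions. It is not the HTK implementation used in this work. The function names (`train_bigram`, `log_score`), the add-one smoothing, and the toy phoneme labels are all assumptions made for illustration.

```python
from collections import Counter
import math


def train_bigram(sequences, vocab):
    """Estimate add-one-smoothed bigram probabilities P(cur | prev)."""
    context = Counter()  # how often each symbol appears as a left context
    bigram = Counter()   # how often each (prev, cur) pair appears
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            context[prev] += 1
            bigram[(prev, cur)] += 1
    # Possible successors of any context: every phoneme plus the end marker.
    n_successors = len(vocab) + 1

    def prob(prev, cur):
        return (bigram[(prev, cur)] + 1) / (context[prev] + n_successors)

    return prob


def log_score(prob, seq):
    """Log-probability of a phoneme sequence under the bigram model."""
    padded = ["<s>"] + list(seq) + ["</s>"]
    return sum(math.log(prob(p, c)) for p, c in zip(padded, padded[1:]))


# Toy, made-up phoneme labels (assumption: not the paper's phoneme set or corpus).
toy_transcripts = [
    ["k", "a", "t", "a", "b", "a"],
    ["k", "i", "t", "a", "b"],
    ["b", "a", "t"],
]
vocab = sorted({p for seq in toy_transcripts for p in seq})
prob = train_bigram(toy_transcripts, vocab)

seen = log_score(prob, ["k", "a", "t"])    # built from bigrams seen in training
unseen = log_score(prob, ["t", "k", "b"])  # mostly unseen bigrams
print(seen > unseen)
```

In a recognizer, scores of this kind are combined with the acoustic HMM scores during decoding, so that phoneme sequences that are unlikely in the language are penalized even when they match the audio reasonably well.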


Arabic Phonemes Transcription Using Data Driven Approach

Khalid Nahar is an assistant professor in the Computer Science Department at Tabuk University, KSA, on unpaid leave from Yarmouk University, Jordan. He received his BS and MS degrees in computer science from Yarmouk University, Jordan, in 1992 and 2005, respectively. He received his PhD in Computer Science and Engineering from King Fahd University of Petroleum and Minerals (KFUPM). He worked at Yarmouk University as a teaching and research assistant for 7 years and as a lecturer for 5 years, and he now works at Tabuk University, KSA. His research interests include continuous speech recognition, Arabic computing, natural language processing, multimedia computing, content-based retrieval, artificial intelligence, and software engineering. He has participated in funded research projects.

Husni Al-Muhtaseb is an Assistant Professor in the Computer Science Department. He obtained a PhD degree from the Department of Electronic Imaging and Media Communications (EIMC) of the School of Informatics at the University of Bradford, UK, in 2010. He received his M.S. degree in computer science and engineering from King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia, in 1988 and his B.E. degree in electrical engineering, computer option, from Yarmouk University, Irbid, Jordan, in 1984. He is currently an Assistant Professor of Information and Computer Science at KFUPM. He worked as an Instructor in the same department from 1992 to 2010, and as a technical consultant to the dean of admissions and registration from 1996 to 2007. From 1988 to 1992, he worked as a lecturer at KFUPM. From 1984 to 1988, he worked as a Research and Teaching Assistant at Yarmouk University and KFUPM. His research interests include software development, Arabic computing, computer Arabization, Arabic OCR, e-learning and online tutoring, and natural Arabic understanding. He developed the first course in the world on Arabization of Computers. Currently, several universities and colleges are adapting the course. He has participated in several industrial projects with different institutes and organizations, including KACST, STC, MOHE and Aramco. He has also worked as a consultant for different entities, including KFUPM schools and the Ministry of Education. He has more than 60 research publications. He received the first excellence award in instructional technologies at KFUPM for the year 2007.

Wasfi Al-Khatib is an Assistant Professor at King Fahd University of Petroleum and Minerals, Saudi Arabia. He received his BS degree in computer science from Kuwait University in 1990, and his MS degree in computer science and PhD in Electrical and Computer Engineering from Purdue University in 1995 and 2001, respectively. He worked at Wright State University in Dayton, Ohio, as an assistant professor from 2001 to 2002. His research interests include Arabic computing, multimedia computing, content-based retrieval, artificial intelligence, and software engineering. He has participated in funded research projects and supervised many graduate students. He has also participated in curriculum development and ABET assessment accreditation efforts, both at the department level and at the university level. He is a member of the ACM and the IEEE Computer Society.

Moustafa Elshafei is a Professor in the Systems Engineering Department. He received his PhD (with Dean List) in electrical engineering from McGill University, Canada, in 1982. Since then, he has accumulated over 31 years of both academic and industrial experience. He is the sole inventor or co-inventor of 13 US and international patents. He has over 150 publications in international journals, conferences, and technical reports. He was the PI/CI of many government-funded projects exceeding 8 million SR, and he was also involved in many internally funded or industry-funded projects. His research interests include Arabic speech processing, digital signal processing, and intelligent instrumentation.

Mansour Alghamdi is a Professor in Phonetics at King Abdul Aziz City for Science and Technology, Saudi Arabia. He gained a PhD in phonetics at Reading University, UK, in 1990. He has more than 80 published books, journal papers and conference presentations, and holds 3 patents. He has participated in more than 20 scientific projects at KACST, KFUPM and KSU as principal investigator or co-investigator, and has lectured at several institutions on applied phonetics, including computational linguistics, speech therapy, translation, first language acquisition and foreign language learning. He is now the director of the General Directorate of Scientific Awareness and Publishing and the director of the National Program for Digital Content at King Abdul Aziz City for Science and Technology, Riyadh.